Abstract: This article is concerned with decentralized sequential
testing of multiple hypotheses. In a sensor network system with
limited local memory, raw observations are collected at the local
sensors and quantized into binary sensor messages that are sent to a
fusion center, which makes a final decision. It is assumed that the
raw sensor observations are distributed according to one of a set of
$M > 2$ specified distributions, and the fusion center has to use the
quantized sensor messages to decide which one is the true
distribution. Asymptotically Bayes tests are offered for
decentralized multihypothesis sequential detection by combining three
existing methodologies: tandem quantizers, unambiguous likelihood
quantizers, and randomized quantizers.

I. Introduction 

As a subfield of signal detection or hypothesis testing,
multihypothesis sequential detection has many important engineering
applications such as target detection in multiple-resolution radar,
serial acquisition of direct-sequence spread spectrum signals, and
fault detection; see Baum and Veeravalli [1]. The centralized version
has been studied in both the statistical and the engineering
literature; see the award-winning papers by Dragalin, Tartakovsky and
Veeravalli [4], and the references therein, for the latest
developments.

In recent decades the decentralized version of signal detection or
hypothesis testing has gained a great deal of attention, partly
because geographically distributed sensors have been deployed in a
wide range of areas such as military surveillance lUTl, target
tracking and classification [8], and data filtering [18]. In the
decentralized version, it is standard to assume that raw observations
are collected at the local sensors and quantized into sensor messages
that are sent to a fusion center, which makes a final decision.
Unfortunately, most research on decentralized detection deals with
the off-line setting, and research for the online or sequential
setting is rather limited. To the best of our knowledge, existing
research on decentralized sequential detection is restricted to the
two-hypothesis case; see Veeravalli, Basar, and Poor [15], Veeravalli
[14] and Mei [10].

The goal of this paper is to develop asymptotic optimality theory for
decentralized sequential detection when there are $M > 2$ possible
hypotheses on the models of the sensor network system. A main
challenge is how to find good quantizers at the local sensors so that
the fusion center is able to use the quantized sensor messages to
make effective decisions. Intuitively, the choice of good quantizers
should depend on the true unknown distribution of the raw sensor
observations. Since there are $M > 2$ hypotheses, it is expected that
stationary quantizers will not lead to (asymptotically) optimal tests
no matter how cleverly they are chosen. It turns out that by
combining three existing methodologies, namely "tandem quantizers" in
Mei [10], "unambiguous likelihood quantizers" (ULQ) in Tsitsiklis
[13], and randomized quantizers, we are able to find good quantizers
and use them to offer a family of asymptotically Bayes tests for
decentralized multihypothesis sequential detection.

The remainder of this article is organized as follows. Section II
provides a formal mathematical formulation of the decentralized
sequential multihypothesis testing problem and introduces the notion
of a randomized quantizer. Section III discusses tandem quantizers
and constructs a family of "two-stage" decentralized sequential
tests. This leads to a natural definition of "maximin quantizers,"
for which the corresponding two-stage decentralized sequential tests
are shown to be asymptotically Bayes. In Section IV, the maximin
quantizers are characterized in more detail as randomized quantizers
based on at most $M - 1$ (deterministic) ULQs, and numerical
algorithms are provided to compute them explicitly. Section V
provides specific examples to illustrate the method developed in the
previous sections.

II. Notations and Problem Formulation 

Fig. 1 shows a widely used configuration of sensor networks, where a
fusion center is associated with a set of remote local sensors
$S_1, \ldots, S_K$. To highlight our main ideas, we assume $K = 1$
here, since the extension to systems with multiple sensors is
relatively straightforward as long as the sensor observations are
independent from sensor to sensor conditioned on each hypothesis. The
local sensor takes a sequence of independent and identically
distributed (i.i.d.) raw observations $X_1, X_2, \ldots$ over time
$n$. In the decentralized version, it is assumed that the fusion
center has no direct access to the raw sensor data $X_n$ due to
communication constraints. Rather, the local sensor compresses $X_n$
into a quantized message $U_n \in \{0, 1, \ldots, l - 1\}$ and sends
it to the fusion center, which then uses the $U_n$'s as inputs to
make a final decision. For our purpose, we also assume that the
fusion center can send feedback $V_{n-1}$ to the local sensor so that
the local sensor can adaptively adjust its policy toward the optimal
one. For simplicity, we further assume the quantized messages to be
binary, i.e., $U_n \in \{0, 1\}$.

Mathematically, at time $n$, the sensor message $U_n$ and the fusion
center feedback $V_{n-1}$ can be defined as

$$U_n = \phi_n(X_n; V_{n-1}) \in \{0, 1\}, \qquad V_{n-1} = \psi_n(U_{[1,n-1]}),$$

where $U_{[1,n-1]} = (U_1, \ldots, U_{n-1})$. Note that the feedback
$V_{n-1}$ should only depend on the past sensor messages. Here no
restrictions are imposed on $V_{n-1}$, but it turns out that
$\log_2(M)$-bit feedback is sufficient to construct asymptotically
optimal tests under our setting.

Fig. 1. A Sensor Network

In decentralized multihypothesis sequential detection, it is assumed
that there are $M > 2$ hypotheses regarding the true probability
distribution $P$ of the $X_n$'s:

$$H_m : P = P_m \tag{1}$$

for $m = 0, 1, \ldots, M - 1$, where the $X_n$'s have a probability
density (or mass) function $f_m(x)$ under $P_m$. Furthermore, the
sensor network system continues taking observations until the fusion
center believes that there is sufficient evidence from the quantized
messages $U_n$ to make a final decision. That is, at a stopping time
$N$, the fusion center makes a decision $D \in \{0, 1, \ldots, M - 1\}$,
where $\{D = m\}$ means that one accepts the hypothesis $H_m$. Here we
emphasize that the event $\{N = n\}$ only depends on the first $n$
sensor messages, i.e., $N$ is a stopping time with respect to the
filtration $\{\mathcal{F}_n = \sigma(U_{[1,n]})\}$ and $D$ is
measurable with respect to $\mathcal{F}_N$.

In summary, a decentralized sequential test $\delta$ includes a
sequence of quantizers $\phi_n$ at the local sensor, a sequence of
feedback functions $\psi_n$, a stopping time $N$ at the fusion center,
and the final decision $D$.

As in Wald ifTTI and Veeravalli et al. [15], we consider the Bayes
formulation of decentralized multihypothesis sequential detection.
Let $c > 0$ be the cost of data sampling per time step, and let
$W(m, m')$ be the loss of making decision $D = m'$ when the true state
of nature is $P_m$. We assume that all $W(m, m')$ are non-negative and
that $W(m, m') = 0$ if and only if $m = m'$. Let the total risk of a
test $\delta$ when the true state is $m$ be

$$\mathcal{R}_c(\delta; m) = c E_m N + \sum_{m'} W(m, m') P_m[D = m'].$$

Assigning prior probabilities $\pi = (\pi_0, \ldots, \pi_{M-1})$ to
$H_0, \ldots, H_{M-1}$, define the average risk of a decentralized
sequential test $\delta$ as

$$\mathcal{R}_c(\delta) = \sum_{m=0}^{M-1} \pi_m \mathcal{R}_c(\delta; m). \tag{2}$$

The Bayesian formulation of decentralized sequential detection
problems can be stated as follows:

Problem (P1): Minimize the average risk $\mathcal{R}_c(\delta)$ in (2)
among all possible decentralized sequential hypothesis testing
procedures $\delta$.

Let $\delta_B^*(c)$ denote a Bayes solution to (P1), i.e., $\delta_B^*(c) =
\arg\min_{\delta} \mathcal{R}_c(\delta)$. Unfortunately, the exact form of $\delta_B^*(c)$ is
too complicated to be tractable for multihypothesis sequential 
detection even for the centralized version, see, for example, 
Dragalin, Tartakovsky and Veeravalli Q. This leads us to 
consider the "asymptotic optimality" approach as follows: 

Problem (P2): Find a family of decentralized sequential tests
$\{\delta(c)\}$ such that

$$\lim_{c \to 0} \mathcal{R}_c(\delta_B^*(c)) / \mathcal{R}_c(\delta(c)) = 1,$$

where $c$ is the unit cost in (2).

Problem (P2) is meaningful in application because it is often 
the case that the cost of doing a round of sampling is much 
smaller than that of making an incorrect decision. 

In the remainder of this section, let us discuss the concepts of
randomized quantizers and Kullback-Leibler (K-L) divergences. Denote
by $\Phi$ the set of deterministic quantizers, which consists of all
measurable functions from $\mathbb{R}$ to $\{0, 1\}$. For a quantizer
$\phi \in \Phi$, let $f_m(\cdot; \phi)$ denote the induced
distribution of the quantized data $\phi(X_n)$ under $P_m$, i.e., for
$u \in \{0, 1\}$, $f_m(u; \phi) = P_m(\phi(X_n) = u)$. Recall that the
K-L divergence of $\phi$ for any state $m$ against any other state
$m' \ne m$ is defined as

$$I(m, m'; \phi) = \sum_{u=0}^{1} f_m(u; \phi) \log \frac{f_m(u; \phi)}{f_{m'}(u; \phi)}.$$
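As an illustration of this definition, the following Python sketch evaluates $I(m, m'; \phi)$ for a single-threshold quantizer. It assumes unit-variance normal observations (the setting used later in Section V) and a quantizer of the form "send 1 if the observation exceeds a threshold"; the function names `normal_cdf`, `bernoulli_kl`, and `quantizer_kl` are hypothetical helpers introduced only for this example.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bernoulli_kl(p, q):
    """K-L divergence of Bernoulli(p) against Bernoulli(q), in nats (0 < q < 1)."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:  # the term 0 * log(0 / b) is taken to be 0
            total += a * math.log(a / b)
    return total

def quantizer_kl(mean_m, mean_alt, threshold):
    """I(m, m'; phi) for phi(x) = 1 if x > threshold else 0.

    Assumption for this sketch: X ~ N(mean_m, 1) under state m and
    X ~ N(mean_alt, 1) under state m'.
    """
    p = 1.0 - normal_cdf(threshold - mean_m)    # f_m(1; phi)
    q = 1.0 - normal_cdf(threshold - mean_alt)  # f_m'(1; phi)
    return bernoulli_kl(p, q)
```

For instance, `quantizer_kl(0.0, 1.0, 0.0)` evaluates to roughly 0.3137, which agrees with the information number $I(0)$ reported for the normal example in Section V.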



Now define a "randomized quantizer" $\phi = \sum_j p^j \phi^j$ as a
probability measure that assigns masses $\{p^j\}$ to an at most
countable subset $\{\phi^j\} \subset \Phi$. Denote by $\bar{\Phi}$ the
set of all quantizers, deterministic or randomized. Note that a
deterministic quantizer can be thought of as a randomized one that
assigns probability one to itself. For a randomized quantizer
$\phi = \sum_j p^j \phi^j$, define its K-L divergences as the weighted
average of those of the deterministic quantizers it randomizes:

$$I(m, m'; \phi) = \sum_j p^j I(m, m'; \phi^j).$$

This divergence for randomized quantizers will be key to our
theorems.
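The weighted-average definition can be sketched in a few lines of Python. Here each deterministic binary quantizer is summarized by the probability that it outputs 1 under the two states being compared; this representation and the names `bernoulli_kl` and `randomized_kl` are assumptions made for the example only.

```python
import math

def bernoulli_kl(p, q):
    """K-L divergence of Bernoulli(p) against Bernoulli(q), in nats (0 < q < 1)."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:  # the term 0 * log(0 / b) is taken to be 0
            total += a * math.log(a / b)
    return total

def randomized_kl(weights, p_list, q_list):
    """I(m, m'; phi) for a randomized quantizer phi = sum_j p^j phi^j.

    weights[j] : mixing probability p^j
    p_list[j]  : P_m(phi^j(X) = 1)
    q_list[j]  : P_m'(phi^j(X) = 1)
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing probabilities must sum to one")
    return sum(w * bernoulli_kl(p, q)
               for w, p, q in zip(weights, p_list, q_list))
```

A deterministic quantizer is recovered by passing a single weight equal to one.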

The following assumption ensures basic regularity of the pdf's; it is
imposed throughout the rest of the paper.



Assumption 1. For any two states $0 \le m \ne m' < M$, ... $f_m(X_1)$ ... $\log$ ...
III. Our Proposed Test $\delta_A(c)$

In this section we use tandem quantizers to define a class of
"two-stage" tests $\delta(c)$, and show that asymptotically Bayes tests
can be found within it. The intuition is that the fusion center first
makes a guess about the true state of nature and then tries to
optimize the test based on that guess.



As discussed in Mei [10], a tandem quantizer refers to the case in
which each sensor has the choice between two different sensor
quantizers with at most one switch between them. Obviously, a tandem
quantizer is the simplest non-stationary quantizer from the viewpoint
of the number of switches. For the purpose of defining the two-stage
sequential test $\delta(c)$, a useful alternative way to think about
tandem quantizers is to divide the decision making into two stages.

In the first stage of $\delta(c)$, one can use any reasonable
stationary quantizer to make a preliminary decision on which of the
$M$ hypotheses is likely true; the only requirement is that the
sample size of this stage is large but small relative to the overall
sample size (or that of the second stage). Specifically, as
$c \to 0$, consider a sequence $u(c) \in (0, 1/2)$ such that
$u(c) \to 0$ and $\log u(c) / \log c \to 0$, e.g.,
$u(c) = 1 / |\log c|$, and assume there is a quantizer
$\phi^0 \in \Phi$ such that for any $0 \le m \ne m' \le M - 1$,

$$I(m, m'; \phi^0) > 0. \tag{3}$$

Now in the first stage, the local sensor uses the stationary
quantizer $\phi^0$ to send sensor messages to the fusion center, which
then faces the classical multihypothesis sequential detection problem
based on the i.i.d. quantized sensor messages $\phi^0(X_n)$. Hence,
starting from the prior $\pi_{m,0} = \pi_m$, one can recursively
update the posterior distribution $(\pi_{0,n}, \ldots, \pi_{M-1,n})$,
$n = 1, 2, \ldots$, at the fusion center as follows:

$$\pi_{m,n} = \frac{\pi_{m,n-1} f_m(U_n; \phi^0)}{\sum_{m'=0}^{M-1} \pi_{m',n-1} f_{m'}(U_n; \phi^0)}, \tag{4}$$

where $U_n$ is the quantized message at time $n$. As a reasonable test
for the preliminary decision, the fusion center stops the first stage
at time $N^0$:

$$N^0 = \min\{n \ge 1 : \max_{0 \le m \le M-1} \pi_{m,n} \ge 1 - u(c)\},$$

and decides that the preliminary decision $D^0$ of the most promising
state of nature is

$$D^0 = \arg\max_{0 \le m \le M-1} \pi_{m,N^0}.$$
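The first stage described above can be sketched as follows. This Python fragment is only an illustration under stated assumptions: the binary quantizer $\phi^0$ is summarized by the list `p_one`, whose entry $m$ is $P_m(U_n = 1)$, the stopping comparison is taken to be non-strict, and the function names are hypothetical.

```python
def update_posterior(posterior, u, p_one):
    """One step of the recursion (4) for a binary message u in {0, 1}.

    p_one[m] is P_m(U = 1) under the first-stage quantizer phi^0.
    """
    weights = [pi * (p if u == 1 else 1.0 - p)
               for pi, p in zip(posterior, p_one)]
    total = sum(weights)
    return [w / total for w in weights]

def first_stage(prior, p_one, messages, u_c):
    """Run the first stage on a stream of binary sensor messages.

    Stops at the first n with max_m pi_{m,n} >= 1 - u_c and returns
    (N0, D0, posterior). If the stream ends before the threshold is
    reached, returns (None, None, posterior).
    """
    posterior = list(prior)
    for n, u in enumerate(messages, start=1):
        posterior = update_posterior(posterior, u, p_one)
        best = max(range(len(posterior)), key=lambda m: posterior[m])
        if posterior[best] >= 1.0 - u_c:
            return n, best, posterior
    return None, None, posterior
```

In practice `u_c` would be set to a slowly vanishing function of the sampling cost, such as $1 / |\log c|$, so that the first stage is long enough to be reliable but short relative to the whole test.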

In the second stage of our two-stage test $\delta(c)$, the local
sensor switches to another stationary (though likely randomized)
quantizer, whose choice will typically depend on the preliminary
decision $D^0$ of the first stage. Denote the quantizer used in the
second stage by $\phi_m$ when $D^0 = m$, where
$m = 0, 1, \ldots, M - 1$.

In the second stage, with the new quantizer applied at the local
sensor, the fusion center starts afresh to update the posterior
distribution $(\pi_{0,n}, \ldots, \pi_{M-1,n})$ based on the i.i.d.
sensor messages in the second stage. An efficient stopping rule for
the fusion center can then be found as in Dragalin et al. H as
follows. Let $r_{m,n} = \sum_{m' \ne m} \pi_{m',n} W(m', m)$ be the
average loss of making decision $m$ at time $n$, and let
$r'_{m,n} = \min_{m' \ne m} \pi_{m,n} W(m, m')$ be the least loss
incurred by making some decision $m' \ne m$ at time $n$ while $m$ is
the true state of nature. Define a total of $M$ stopping times:

$$N_m = \min\left\{n > N^0 : \frac{r'_{m,n}}{r_{m,n}} \ge \frac{1}{c}\right\}, \quad m = 0, 1, \ldots, M - 1. \tag{5}$$

The fusion center can stop the second stage (hence the whole
procedure) at time $N = \min\{N_m : 0 \le m \le M - 1\}$, and makes a
final decision $D = m$ if $N = N_m$.
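The stopping rule of the second stage can be checked step by step with a small routine such as the Python sketch below. It assumes the loss is stored as a nested list with `loss[m][k]` equal to $W(m, k)$, that the ratio comparison is non-strict, and that ties between several hypotheses are broken by the smallest index; these conventions and the function names are assumptions of the example.

```python
def average_loss(posterior, loss, m):
    """r_{m,n}: sum over m' != m of pi_{m',n} * W(m', m)."""
    return sum(posterior[k] * loss[k][m]
               for k in range(len(posterior)) if k != m)

def least_loss(posterior, loss, m):
    """r'_{m,n}: min over m' != m of pi_{m,n} * W(m, m')."""
    return min(posterior[m] * loss[m][k]
               for k in range(len(posterior)) if k != m)

def second_stage_decision(posterior, loss, c):
    """Return the hypothesis m whose stopping time N_m fires at this
    step, or None if the test should continue sampling."""
    for m in range(len(posterior)):
        r = average_loss(posterior, loss, m)
        r_prime = least_loss(posterior, loss, m)
        # r'/r >= 1/c, rewritten to avoid dividing by r when r == 0
        if r_prime >= r / c:
            return m
    return None
```

Under the 0-1 loss, the condition reduces to $\pi_{m,n} / (1 - \pi_{m,n}) \ge 1/c$, so the test stops once the posterior odds in favor of some hypothesis reach $1/c$.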

It is worth discussing the implementation of the (likely randomized)
quantizer $\phi_m$ when $D^0 = m$ is the preliminary decision. We also
need to give an explicit formula for updating the posterior when a
randomized quantizer is used to form the reports. Suppose
$\phi_m = \sum_j p^j \phi^j$. The key to any allowable randomization
scheme is that the fusion center must know which deterministic
quantizer is actually chosen; otherwise it may lose significant
information and compromise the efficiency of the decision making. We
propose two alternative ways to achieve this goal. The most
straightforward way is to let the fusion center do the randomization
directly. Specifically, at a time step $n$ of the second stage, the
fusion center selects a deterministic quantizer $\phi^j$ randomly
according to the probability measure $\{p^j\}$ and informs the local
sensor of its choice through feedback. Meanwhile, the posterior
distribution should be updated as follows:

$$\pi_{m,n} = \frac{\pi_{m,n-1} f_m(U_n; \phi^j)}{\sum_{0 \le m' \le M-1} \pi_{m',n-1} f_{m'}(U_n; \phi^j)}, \tag{6}$$

where $\phi^j$ is the deterministic quantizer selected at time $n$.

An alternative way of randomization is to implement a "block design"
at the local level. Suppose that $\phi_m$ randomizes a finite number,
say $J$, of deterministic quantizers, and that $b$ is a common
denominator of the rational probabilities $p^1, \ldots, p^J$. Then
take "blocks" of $b$ observations, and in each block use
$\phi^1, \ldots, \phi^J$ in a fixed order such that each $\phi^j$
appears exactly $b p^j$ times. In this way the fusion center also
knows which quantizer is used at each time step, and it updates the
posterior just as in (6).
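The block design can be sketched in Python as follows. The example assumes the mixing probabilities are supplied as exact rationals (for instance as strings such as `"1/3"` or as `Fraction` objects) and takes $b$ to be their least common denominator; the function name `block_schedule` and the choice of ordering within a block are hypothetical.

```python
from fractions import Fraction
from math import gcd

def block_schedule(probabilities):
    """Return one block of quantizer indices for the block design.

    Index j appears exactly b * p^j times, where b is the least common
    denominator of the rational probabilities p^1, ..., p^J.
    """
    fracs = [Fraction(p) for p in probabilities]
    if sum(fracs) != 1:
        raise ValueError("probabilities must sum to one")
    b = 1
    for f in fracs:
        # running least common multiple of the denominators
        b = b * f.denominator // gcd(b, f.denominator)
    block = []
    for j, f in enumerate(fracs):
        block.extend([j] * int(f * b))
    return block
```

For example, `block_schedule(["1/3", "2/3"])` yields `[0, 1, 1]`: the first quantizer is used once and the second twice in every block of three observations. Since the order is fixed in advance, the fusion center can apply the posterior update with the quantizer scheduled for each time step without any extra feedback.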

For our proposed two-stage procedure $\delta(c)$, its asymptotic
properties are summarized in the following theorem, whose proof is
omitted since it can be derived along the same lines as those in
Section V of Kiefer and Sacks [7]. To state the theorem, we first
define the following information number for a quantizer
$\phi \in \bar{\Phi}$ and state $m = 0, 1, \ldots, M - 1$:

$$I(m; \phi) = \min_{m' \ne m} I(m, m'; \phi). \tag{7}$$

Theorem 1. Let $\{\phi_m : m = 0, 1, \ldots, M - 1\}$ be the
randomized quantizers applied in the second stage of $\delta(c)$,
where each $\phi_m$ randomizes a finite number of deterministic
quantizers. Suppose $I(m; \phi_m) > 0$ and $\pi_m > 0$ for any $m$.
Then as $c \to 0$, for the sample size $N$:

$$E_m[N] = (1 + o(1)) |\log c| / I(m; \phi_m), \quad m = 0, 1, \ldots, M - 1, \tag{8}$$

and for the probability of incorrect decisions:

$$P_m[D \ne m] = O(c), \quad m = 0, 1, \ldots, M - 1. \tag{9}$$

Thus, the Bayes risk of the proposed two-stage test $\delta(c)$ is
given by

$$\mathcal{R}_c(\delta) = c |\log c| (1 + o(1)) \sum_{m=0}^{M-1} \frac{\pi_m}{I(m; \phi_m)}. \tag{10}$$

In light of Theorem 1, from the asymptotic viewpoint, an optimal
procedure within the class of two-stage tests should maximize the
information numbers $I(m; \phi_m)$ so as to minimize the Bayes risk.
This leads to a natural definition of the optimal quantizers that we
should use in the second stage:

Definition 1. For $m = 0, 1, \ldots, M - 1$, the quantizer
$\phi_m^{\max}$ is defined as the maximin quantizer with respect to
$P_m$ if

$$\phi_m^{\max} = \arg\sup_{\phi \in \bar{\Phi}} I(m; \phi).$$

Let us focus on the two-stage procedure $\delta_A(c)$ with the
maximin quantizers applied in the second stage. In the next section,
we will show that each $\phi_m^{\max}$ can be attained by randomizing
at most $M - 1$ deterministic quantizers. Hence by Theorem 1 it has a
Bayes risk

$$\mathcal{R}_c(\delta_A(c)) = (1 + o(1)) c |\log c| \sum_{m=0}^{M-1} \frac{\pi_m}{I(m)} \tag{11}$$

as $c \to 0$, where $I(m) = \sup_{\phi \in \bar{\Phi}} I(m; \phi)$.

Surprisingly, the test $\delta_A(c)$ is not only the best among the
two-stage tests, but also an asymptotically Bayes solution to problem
(P2). This is a direct consequence of the following important
theorem:

Theorem 2. Relation (11) is also satisfied by $\delta_B^*(c)$, the
Bayes procedure.

Proof: The conclusion will be established once we prove the
following: for any test whose probabilities of making incorrect
decisions satisfy $P_m(D \ne m) = O(c \log c)$ for
$m = 0, 1, \ldots, M - 1$, the expected number of time steps must
satisfy $E_m N \ge (1 + o(1)) |\log c| \, I(m)^{-1}$ for any state $m$
as $c \to 0$. This can be proved in the same way as Theorem 1 of
Tsitovich [12]. ■

It is useful to point out that although the stopping rules of the
asymptotically Bayes test $\delta_A(c)$ involve the prior
distribution $\{\pi_m\}$, this is not essential; the key is for the
local sensor to use the maximin quantizers $\phi_m^{\max}$ at the
second stage. In fact, since the maximin quantizers do not depend on
the prior distribution $\{\pi_m\}$, (8) and (9) show that the
optimality of $\delta_A(c)$ is robust with respect to the prior
distribution $\{\pi_m\}$ as long as its support covers all $M$
possible states of nature.

IV. Characterizing the Maximin Quantizers 

In this section, we provide a deeper understanding of the maximin
quantizers $\{\phi_m^{\max} : m = 0, 1, \ldots, M - 1\}$ and also
illustrate how to compute them explicitly when the sensor messages
are binary. For this purpose, we first introduce the concept of the
unambiguous likelihood quantizer (ULQ), which was proposed in
Tsitsiklis [13] as a generalization of the Monotone Likelihood Ratio
Quantizer (MLRQ).



For simplicity, we assume that for any set of real numbers
$\{a_{m'} : 0 \le m' \le M - 1\}$ which are not all zero,

$$P_m\Big(\sum_{m'} a_{m'} f_{m'}(X) = 0\Big) = 0, \quad 0 \le m \le M - 1. \tag{12}$$

Note that (12) is easily satisfied by common continuous pdf families
such as the normal and the exponential.

Definition 2. Under (12), a deterministic quantizer $\phi \in \Phi$ is
said to be an unambiguous likelihood quantizer if there exist real
numbers $\{a_m : 0 \le m \le M - 1\}$ which are not all zero, such
that

$$\phi(X) = \mathbf{1}\Big(\sum_m a_m f_m(X) > 0\Big).$$

It is easy to see that in the case of binary simple hypothesis
testing, i.e., $M = 2$, the ULQs become MLRQs.

With the definition of ULQs, we can now state the following useful
theorem, which characterizes the maximin quantizers
$\{\phi_m^{\max}\}$.

Theorem 3. Under (12), each maximin quantizer $\phi_m^{\max}$ can be
attained as a randomization of at most $M - 1$ ULQs.

The detailed proof involves tedious technical details, and thus here
we only provide a short high-level explanation. For a fixed state
$m$, finding the maximin quantizer against the other $M - 1$ states
is equivalent to solving an optimization problem in an
$(M - 1)$-dimensional space, where each quantizer, deterministic or
randomized, corresponds to a point in it. By Tsitsiklis [13], these
points form a convex region whose extremal points all correspond to
ULQs under condition (12). Moreover, the maximin quantizers
correspond to points that must lie on the surface of the convex
region, and thus can be expressed as a convex combination of at most
$M - 1$ extremal points (see Hormander [6]). Combining these results
leads to the desired relation between the maximin quantizers and the
ULQs.

With Theorem 3, we are ready to illustrate how to find the maximin
quantizers numerically.

Fix any state $m$, and define $M^2 - 1$ parameters: the probability
masses $\{p_m^j : 1 \le j \le M - 1, \ p_m^j \ge 0, \ \sum_j p_m^j = 1\}$
and the ULQ coefficients
$\{a_{m,m'}^j : 1 \le j \le M - 1, \ 0 \le m' \le M - 1, \ \sum_{m'} (a_{m,m'}^j)^2 = 1\}$.
For every combination of these parameters, define $\phi$ as the
quantizer randomizing $M - 1$ ULQs:
$\phi = \sum_{j=1}^{M-1} p_m^j \phi_m^j$, where

$$\phi_m^j(X) = \mathbf{1}\Big(\sum_{m'} a_{m,m'}^j f_{m'}(X) > 0\Big).$$

The maximin quantizer $\phi_m^{\max}$ can then be found as the $\phi$
that maximizes

$$\min_{l \ne m} I(m, l; \phi), \tag{13}$$

among all possible combinations of $\{p_m^j; a_{m,m'}^j\}$.
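To give a feel for the optimization in (13), the Python sketch below performs a plain grid search in a deliberately simplified setting. It assumes unit-variance normal observations and restricts attention to deterministic single-threshold quantizers of the form "send 1 if the observation exceeds $t$", so it does not search over randomizations or over general ULQ coefficients; it is a swapped-in simplification rather than the full procedure described above. All function names and the grid range are assumptions of the example.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bernoulli_kl(p, q):
    """K-L divergence of Bernoulli(p) against Bernoulli(q), in nats."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:  # the term 0 * log(0 / b) is taken to be 0
            total += a * math.log(a / b)
    return total

def info_number(threshold, mean_m, other_means):
    """min over alternatives of I(m, l; phi) for phi(x) = 1 if x > threshold."""
    p = 1.0 - normal_cdf(threshold - mean_m)
    return min(bernoulli_kl(p, 1.0 - normal_cdf(threshold - mu))
               for mu in other_means)

def best_threshold(mean_m, other_means, lo=-3.0, hi=3.0, steps=6001):
    """Grid search for the threshold maximizing the objective in (13),
    restricted to single-threshold quantizers (a simplifying assumption)."""
    best_t, best_val = lo, -1.0
    for i in range(steps):
        t = lo + (hi - lo) * i / (steps - 1)
        val = info_number(t, mean_m, other_means)
        if val > best_val:
            best_t, best_val = t, val
    return best_t, best_val
```

With means $0, -1, 1$ for the three states, calling `best_threshold(0.0, [-1.0, 1.0])` and `best_threshold(-1.0, [0.0, 1.0])` gives thresholds close to $0$ and $-0.794$ with objective values close to $0.3137$ and $0.3186$, in line with the numbers reported in Section V. A search over randomized combinations of general ULQs would need an additional loop over the mixing weights and coefficients.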



V. Examples 
In this section we illustrate our procedure with a concrete example.
Suppose that the raw sensor observations $X_n$ are distributed
according to $N(\mu, 1)$. If there are only $M = 2$ hypotheses on
$\mu$, say testing $H_0 : \mu = 0$ against $H_1 : \mu = 1$, then no
randomization is involved in the second stage, and the maximin
quantizers are just ULQs, which become MLRQs when $M = 2$. Such a
result is consistent with those in Mei [10].

Now suppose there are $M = 3$ hypotheses regarding the normal mean:
$H_0 : \mu = 0$, $H_1 : \mu = -1$, and $H_2 : \mu = 1$. For this
specific case, it is not too difficult to solve the optimization
problem (13) by linear programming. Up to a precision of four decimal
places, numerical computations show that all three maximin quantizers
turn out to be deterministic: $\phi_0 = \mathbf{1}(X > 0)$,
$\phi_1 = \mathbf{1}(X > -0.7941)$, and
$\phi_2 = \mathbf{1}(X > 0.7941)$, and their corresponding maximin
information numbers are $I(0) = 0.3137$ and $I(1) = I(2) = 0.3186$.
For the first stage, the quantizer $\phi_0$ can be applied because it
satisfies condition (3). By Theorem 1, the risk of $\delta_A(c)$ can
be approximated by

$$\mathcal{R}_c(\delta_A(c)) = c |\log c| (1 + o(1)) \left( \frac{\pi_0}{0.3137} + \frac{\pi_1 + \pi_2}{0.3186} \right).$$

As a comparison, in the centralized version, when the raw
observations themselves can be used at the fusion center, it can be
shown that the Bayes risk of the optimal centralized test is

$$\mathcal{R}_c(\delta_{cen}(c)) = 2 c |\log c| (1 + o(1)),$$

see, for example, Dragalin et al. [3] and Kiefer and Sacks [7]. Thus
the asymptotic efficiency of $\delta_A(c)$ with respect to the optimal
centralized test is

$$\lim_{c \to 0} \mathcal{R}_c(\delta_{cen}(c)) / \mathcal{R}_c(\delta_A(c)) \ge 2 / (1 / 0.3137) = 0.6274.$$

In particular, if we merely introduce another identical sensor into
the network system, then the efficiency of $\delta_A(c)$ will be
doubled and the corresponding decentralized test will have better
properties than $\delta_{cen}(c)$.
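The efficiency comparison can be reproduced with a few lines of arithmetic. The Python sketch below computes the coefficient of $c |\log c|$ in the risk as $\sum_m \pi_m / I(m)$ and takes the ratio of the centralized and decentralized coefficients. The centralized information numbers are taken to be $1/2$ for every state, which is the K-L divergence between unit-variance normals whose means differ by one and is consistent with the factor 2 above; the function names and the particular priors are assumptions of the example.

```python
def risk_constant(prior, info_numbers):
    """Coefficient of c|log c| in the Bayes risk: sum_m pi_m / I(m)."""
    return sum(p / i for p, i in zip(prior, info_numbers))

def efficiency(prior, info_decentralized, info_centralized):
    """Asymptotic ratio of the centralized risk to the decentralized risk."""
    return (risk_constant(prior, info_centralized)
            / risk_constant(prior, info_decentralized))

# Information numbers of the three-hypothesis normal example.
decentralized = [0.3137, 0.3186, 0.3186]
centralized = [0.5, 0.5, 0.5]

# Worst case over priors: all mass on the state with the smallest I(m).
worst_case = efficiency([1.0, 0.0, 0.0], decentralized, centralized)

# Two identical, conditionally independent sensors double each I(m).
two_sensors = efficiency([1.0, 0.0, 0.0],
                         [2.0 * i for i in decentralized], centralized)
```

Here `worst_case` equals $2 \times 0.3137 = 0.6274$, the lower bound stated above, while `two_sensors` is twice that value and therefore exceeds one.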

VI. Conclusion 

In this article, the problem of decentralized sequential testing of
multiple hypotheses in (single-)sensor networks is studied. An
asymptotically Bayes test $\{\delta_A(c)\}$ is constructed by
combining the ideas of "tandem quantizers", "unambiguous likelihood
quantizers", and "randomized quantizers." Such a test involves a new
concept of maximin quantizers, which are discussed in detail, both
theoretically and numerically.

It is natural to extend our results to the networks with 
multiple sensors, where different sensors may use different 
quantizers. A more interesting extension is to understand what 
happens when one or more hypotheses are not simple, i.e., 
the composite multihypotheses case. These will be reported 
elsewhere. 